{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Getting started with prompt flow\n",
    "\n",
    "**Prerequisite** - To make the most of this tutorial, you'll need:\n",
    "- A local clone of the prompt flow repository\n",
    "- A Python environment with Jupyter Notebook support (such as Jupyter Lab or the Python extension for Visual Studio Code)\n",
    "- Know how to program with Python :)\n",
    "\n",
    "_A basic understanding of Machine Learning can be beneficial, but it's not mandatory._\n",
    "\n",
    "\n",
    "**Learning Objectives** - Upon completing this tutorial, you should be able to:\n",
    "\n",
    "- Run your first prompt flow sample\n",
    "- Run your first evaluation\n",
    "\n",
    "\n",
    "The sample used in this tutorial is the [web-classification](../../flows/standard/web-classification/README.md) flow, which categorizes URLs into several predefined classes. Classification is a traditional machine learning task, and this sample illustrates how to perform classification using GPT and prompts."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 0. Install dependent packages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install -r ../../requirements.txt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Create necessary connections\n",
    "Connection helps securely store and manage secret keys or other sensitive credentials required for interacting with LLM and other external tools for example Azure Content Safety.\n",
    "\n",
    "In this notebook, we will use flow `web-classification` which uses connection `open_ai_connection` inside, we need to set up the connection if we haven't added it before. After created, it's stored in local db and can be used in any flow.\n",
    "\n",
    "Prepare your Azure Open AI resource follow this [instruction](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/create-resource?pivots=web-portal) and get your `api_key` if you don't have one."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "from promptflow.client import PFClient\n",
    "from promptflow.connections import AzureOpenAIConnection, OpenAIConnection\n",
    "\n",
    "# client can help manage your runs and connections.\n",
    "pf = PFClient()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    conn_name = \"open_ai_connection\"\n",
    "    conn = pf.connections.get(name=conn_name)\n",
    "    print(\"using existing connection\")\n",
    "except:\n",
    "    # Follow https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal to create an Azure Open AI resource.\n",
    "    connection = AzureOpenAIConnection(\n",
    "        name=conn_name,\n",
    "        api_key=\"<test_key>\",\n",
    "        api_base=\"<test_base>\",\n",
    "        api_type=\"azure\",\n",
    "        api_version=\"<test_version>\",\n",
    "    )\n",
    "\n",
    "    # use this if you have an existing OpenAI account\n",
    "    # connection = OpenAIConnection(\n",
    "    #     name=conn_name,\n",
    "    #     api_key=\"<user-input>\",\n",
    "    # )\n",
    "\n",
    "    conn = pf.connections.create_or_update(connection)\n",
    "    print(\"successfully created connection\")\n",
    "\n",
    "print(conn)"
   ]
  },
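  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check, you can list the connections stored in the local database. This is a minimal sketch using the same `pf.connections` operations shown above; secrets are scrubbed in the returned objects."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# list every connection stored in the local database and print its name\n",
    "for c in pf.connections.list():\n",
    "    print(c.name)"
   ]
  },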
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Run web-classification flow\n",
    "\n",
    "`web-classification` is a flow demonstrating multi-class classification with LLM. Given an url, it will classify the url into one web category with just a few shots, simple summarization and classification prompts."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set flow path"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "flow = \"../../flows/standard/web-classification\"  # path to the flow directory"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Quick test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test flow\n",
    "flow_inputs = {\n",
    "    \"url\": \"https://play.google.com/store/apps/details?id=com.twitter.android\",\n",
    "}\n",
    "flow_result = pf.test(flow=flow, inputs=flow_inputs)\n",
    "print(f\"Flow result: {flow_result}\")"
   ]
  },
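  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The test result is a dict of the flow's outputs, so individual fields can be read directly. For example, the `category` output is the same field that the evaluation step later maps as `${run.outputs.category}`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# \"category\" is an output defined by the web-classification flow\n",
    "print(f\"Predicted category: {flow_result['category']}\")"
   ]
  },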
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Test single node in the flow\n",
    "node_name = \"fetch_text_content_from_url\"\n",
    "node_inputs = {\n",
    "    \"url\": \"https://play.google.com/store/apps/details?id=com.twitter.android\"\n",
    "}\n",
    "flow_result = pf.test(flow=flow, inputs=node_inputs, node=node_name)\n",
    "print(f\"Node result: {flow_result}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Flow as a function\n",
    "\n",
    "We have also implemented a syntex sugar where you can consume a flow like a python function, with ability to override connections, inputs and other runtime configs.\n",
    "Reference [here](./flow-as-function.ipynb) for more details."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from promptflow.client import load_flow\n",
    "\n",
    "flow_func = load_flow(flow)\n",
    "flow_result = flow_func(**flow_inputs)\n",
    "print(f\"Flow function result: {flow_result}\")"
   ]
  },
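  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The linked notebook covers overrides in depth. Below is a minimal sketch of overriding the connection used by an LLM node, assuming the `FlowContext` entity from that notebook and the `classify_with_llm` node defined in the web-classification flow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from promptflow.entities import FlowContext\n",
    "\n",
    "# a sketch: point the \"classify_with_llm\" node at the connection created in step 1\n",
    "flow_func.context = FlowContext(\n",
    "    connections={\"classify_with_llm\": {\"connection\": \"open_ai_connection\"}}\n",
    ")\n",
    "flow_result = flow_func(**flow_inputs)\n",
    "print(f\"Flow function result with overridden connection: {flow_result}\")"
   ]
  },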
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Batch run with a data file (with multiple lines of test data)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "data = \"../../flows/standard/web-classification/data.jsonl\"  # path to the data file\n",
    "\n",
    "# create run with default variant\n",
    "base_run = pf.run(flow=flow, data=data, stream=True)"
   ]
  },
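  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each line of `data.jsonl` is a JSON object whose fields can be referenced in column mappings (for example, `answer` is referenced as `${data.answer}` in the evaluation step below). A quick peek at the first record:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# peek at the first record of the batch input data\n",
    "with open(data) as f:\n",
    "    print(json.loads(f.readline()))"
   ]
  },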
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "details = pf.get_details(base_run)\n",
    "details.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Evaluate your flow\n",
    "Then you can use an evaluation method to evaluate your flow. The evaluation methods are also flows which use Python or LLM etc., to calculate metrics like accuracy, relevance score.\n",
    "\n",
    "In this notebook, we use `classification-accuracy-eval` flow to evaluate. This is a flow illustrating how to evaluate the performance of a classification system. It involves comparing each prediction to the groundtruth and assigns a \"Correct\" or \"Incorrect\" grade, and aggregating the results to produce metrics such as accuracy, which reflects how good the system is at classifying the data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Run evaluation on the previous batch run\n",
    "The **base_run** is the batch run we completed in step 2 above, for web-classification flow with \"data.jsonl\" as input."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "eval_flow = \"../../flows/evaluation/eval-classification-accuracy\"\n",
    "\n",
    "eval_run = pf.run(\n",
    "    flow=eval_flow,\n",
    "    data=\"../../flows/standard/web-classification/data.jsonl\",  # path to the data file\n",
    "    run=base_run,  # specify base_run as the run you want to evaluate\n",
    "    column_mapping={\n",
    "        \"groundtruth\": \"${data.answer}\",\n",
    "        \"prediction\": \"${run.outputs.category}\",\n",
    "    },  # map the url field from the data to the url input of the flow\n",
    "    stream=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "details = pf.get_details(eval_run)\n",
    "details.head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "metrics = pf.get_metrics(eval_run)\n",
    "print(json.dumps(metrics, indent=4))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pf.visualize([base_run, eval_run])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By now you've successfully run your first prompt flow and even did evaluation on it. That's great!\n",
    "\n",
    "You can check out the [web-classification](../../flows/standard/web-classification/) flow and the [classification-accuracy](../../flows/evaluation/eval-classification-accuracy/) flow for more details, and start building your own flow.\n",
    "\n",
    "Or you can move on for a more advanced topic: experiment with a variant."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Another batch run with a variant\n",
    "\n",
    "[Variant](../../../docs/concepts/concept-variants.md) in prompt flow is to allow you do experimentation with LLMs. You can set a variant of Prompt/LLM node pointing to different prompt or use different LLM parameters like temperature.\n",
    "\n",
    "In this example, `web-classification`'s node `summarize_text_content` has two variants: `variant_0` and `variant_1`. The difference between them is the inputs parameters:\n",
    "\n",
    "variant_0:\n",
    "\n",
    "    - inputs:\n",
    "        - deployment_name: gpt-35-turbo\n",
    "        - max_tokens: '128'\n",
    "        - temperature: '0.2'\n",
    "        - text: ${fetch_text_content_from_url.output}\n",
    "\n",
    "variant_1:\n",
    "\n",
    "    - inputs:\n",
    "        - deployment_name: gpt-35-turbo\n",
    "        - max_tokens: '256'\n",
    "        - temperature: '0.3'\n",
    "        - text: ${fetch_text_content_from_url.output}\n",
    "\n",
    "\n",
    "You can check the whole flow definition at [flow.dag.yaml](../../flows/standard/web-classification/flow.dag.yaml)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# use the variant1 of the summarize_text_content node.\n",
    "variant_run = pf.run(\n",
    "    flow=flow,\n",
    "    data=data,\n",
    "    variant=\"${summarize_text_content.variant_1}\",  # here we specify node \"summarize_text_content\" to use variant 1 version.\n",
    "    stream=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "details = pf.get_details(variant_run)\n",
    "details.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Run evaluation on the variant run\n",
    "So that later we can compare metrics and see which works better."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "eval_flow = \"../../flows/evaluation/eval-classification-accuracy\"\n",
    "\n",
    "eval_run_variant = pf.run(\n",
    "    flow=eval_flow,\n",
    "    data=\"../../flows/standard/web-classification/data.jsonl\",  # path to the data file\n",
    "    run=variant_run,  # use run as the variant\n",
    "    column_mapping={\n",
    "        \"groundtruth\": \"${data.answer}\",\n",
    "        \"prediction\": \"${run.outputs.category}\",\n",
    "    },  # map the url field from the data to the url input of the flow\n",
    "    stream=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "details = pf.get_details(eval_run_variant)\n",
    "details.head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "metrics = pf.get_metrics(eval_run_variant)\n",
    "print(json.dumps(metrics, indent=4))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pf.visualize([eval_run, eval_run_variant])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Next Steps\n",
    "\n",
    "Learn more on:\n",
    "- [Manage connections](../../connections/connection.ipynb): how to manage the endpoints/secrets information to access external services including LLMs.\n",
    "- [Chat with PDF](../e2e-development/chat-with-pdf.md): go through an end-to-end tutorial on how to develop a chat application with prompt flow.\n",
    "- [Deploy http endpoint](../flow-deploy/deploy.md): how to deploy the flow as a local http endpoint.\n",
    "- [Prompt flow in Azure AI](./quickstart-azure.ipynb): run and evaluate flow in Azure AI where you can collaborate with team better."
   ]
  }
 ],
 "metadata": {
  "description": "A quickstart tutorial to run a flow and evaluate it.",
  "kernelspec": {
   "display_name": "prompt_flow",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.18"
  },
  "resources": "examples/requirements.txt, examples/flows/standard/web-classification, examples/flows/evaluation/eval-classification-accuracy"
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
